Preparations

Load the necessary libraries

library(tidyverse)  #for data wrangling etc
library(rstanarm)   #for fitting models in STAN
library(cmdstanr)   #for cmdstan
library(brms)       #for fitting models in STAN
library(standist)   #for exploring distributions
library(coda)       #for diagnostics
library(bayesplot)  #for diagnostics
library(ggmcmc)     #for MCMC diagnostics
library(DHARMa)     #for residual diagnostics
library(rstan)      #for interfacing with STAN
library(emmeans)    #for marginal means etc
library(broom)      #for tidying outputs
library(tidybayes)  #for more tidying outputs
library(HDInterval) #for HPD intervals
library(ggeffects)  #for partial plots
library(broom.mixed)#for summarising models
library(posterior)  #for posterior draws
library(ggeffects)  #for partial effects plots
library(patchwork)  #for multi-panel figures
library(bayestestR) #for ROPE
library(see)        #for some plots
theme_set(theme_grey()) #put the default ggplot theme back
source('helperFunctions.R')

Scenario

Here is a modified example from Peake and Quinn (1993). Peake and Quinn (1993) investigated the relationship between the number of individuals of invertebrates living in amongst clumps of mussels on a rocky intertidal shore and the area of those mussel clumps.

Format of peakquinn.csv data files

AREA INDIV
516.00 18
469.06 60
462.25 57
938.60 100
1357.15 48
... ...
AREA Area of mussel clump mm2 - Predictor variable
INDIV Number of individuals found within clump - Response variable

The aim of the analysis is to investigate the relationship between mussel clump area and the number of non-mussel invertebrate individuals supported in the mussel clump.

Read in the data

peake <- read_csv('../public/data/peakquinn.csv', trim_ws=TRUE)
glimpse(peake)
## Rows: 25
## Columns: 2
## $ AREA  <dbl> 516.00, 469.06, 462.25, 938.60, 1357.15, 1773.66, 1686.01, 1786.…
## $ INDIV <dbl> 18, 60, 57, 100, 48, 118, 148, 214, 225, 283, 380, 278, 338, 274…

Exploratory data analysis

Model formula: \[ \begin{align} y_i &\sim{} \mathcal{NB}(\lambda_i, \theta)\\ ln(\lambda_i) &= \beta_0 + \beta_1 ln(x_i)\\ \beta_0 & \sim\mathcal{N}(6,1.5)\\ \beta_1 & \sim\mathcal{N}(0,1)\\ \theta &\sim{} \mathcal{Exp}(1)\\ OR\\ \theta &\sim{} \mathcal{\Gamma}(2,1)\\ \end{align} \]

Fit the model

MCMC sampling diagnostics

Model validation

Fit Negative Binomial model

Partial effects plots

Model investigation

Hypothesis testing

Summary figure

References

Peake, A. J., and G. P. Quinn. 1993. “Temporal Variation in Species-Area Curves for Invertebrates in Clumps of an Intertidal Mussel.” Ecography 16: 269–77.